alexbaden
Contributor

@alexbaden alexbaden commented Jun 13, 2025

This PR marks loads with Subgroup 2D Block Encoding as expensive loads so that their layout is preserved. The code in LoadStoreOpToLLVM::rewriteTensorPointer is modified to support loads that have either the block_io tag and a DPAS layout, or the Subgroup 2D Block Encoding. Both cases are handled similarly: the loaded values in registers are implicitly converted to the DPAS layout via register shuffles. The ConvertLayout op that was left behind in the Subgroup 2D Block Encoding case is deleted.
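
For readers skimming the description, the eligibility condition can be modeled roughly as follows. This is only an illustrative Python sketch, not the C++ code in this PR: the `Layout` enum, the `Load` class, and `uses_2d_block_path` are hypothetical names, and the real check operates on MLIR attributes and encodings rather than plain values.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Layout(Enum):
    # Hypothetical stand-ins for the tensor encodings mentioned above.
    DPAS = auto()
    SUBGROUP_2D_BLOCK = auto()
    BLOCKED = auto()


@dataclass(frozen=True)
class Load:
    # Hypothetical model of a block pointer load.
    layout: Layout
    block_io_tag: Optional[str] = None  # None means the load is untagged


def uses_2d_block_path(load: Load) -> bool:
    """Return True if the load would take the 2D block lowering path.

    Two cases qualify, mirroring the description:
      1. the load carries the Subgroup 2D Block Encoding, or
      2. the load carries a block_io tag and a DPAS layout.
    """
    if load.layout is Layout.SUBGROUP_2D_BLOCK:
        return True
    return load.block_io_tag is not None and load.layout is Layout.DPAS
```

In this model a load such as `Load(Layout.BLOCKED, "row_major")` is rejected, because a block_io tag alone is not enough without a DPAS layout.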

One open issue remains: a barrier appears to be inserted somewhere, which causes a performance regression. I need to determine why that is happening; after that, this should be ready for review.

close #4499

depends on #4463

depends on #4510

@alexbaden alexbaden changed the title from "Alex/use subgroup 2d block encoding pr" to "Use the Subgroup 2D Block Encoding in LoadStoreOpToLLVM" Jun 13, 2025
@alexbaden alexbaden force-pushed the alex/use_subgroup_2d_block_encoding_pr branch from 8e3fbd0 to 7e5b52b Compare June 17, 2025 18:22
@alexbaden alexbaden requested review from etiotto, whitneywhtsang, chengjunlu, a team, ienkovich and kurapov-peter and removed request for etiotto and whitneywhtsang June 18, 2025 13:19
Successfully merging this pull request may close these issues.

Replace block_io with subgroup 2d block encoding in LoadStoreOpToLLVM for Block Ptr loads